This Python file cleans the VGGSound dataset: it renames files, removes samples that are missing a modality, removes samples whose duration is too short, etc.
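As a rough illustration of the filtering steps named above, the following is a minimal sketch and not the actual implementation of this file. The function names, the `.mp4`/`.wav` extensions, the 1-second threshold, and the zero-padded naming scheme are all assumptions made for the example. Durations are assumed to be precomputed and passed in as a dictionary.

```python
from pathlib import PurePath


def clean_sample_ids(video_files, audio_files, durations, min_duration=1.0):
    """Return sorted sample IDs that have both modalities and are long enough.

    `durations` maps sample ID -> length in seconds (assumed precomputed).
    `min_duration` is a hypothetical threshold, not a value from the source.
    """
    video_ids = {PurePath(f).stem for f in video_files}
    audio_ids = {PurePath(f).stem for f in audio_files}
    # Keep only samples present in both modalities.
    paired = video_ids & audio_ids
    # Drop samples with unknown or too-short duration.
    return sorted(i for i in paired if durations.get(i, 0.0) >= min_duration)


def rename_plan(sample_ids):
    """Map each kept ID to a new zero-padded name (hypothetical scheme)."""
    return {old: f"{idx:06d}" for idx, old in enumerate(sample_ids)}


videos = ["a.mp4", "b.mp4", "c.mp4"]
audios = ["a.wav", "c.wav", "d.wav"]
durations = {"a": 10.0, "c": 0.4}

kept = clean_sample_ids(videos, audios, durations)
plan = rename_plan(kept)
```

Here `b` and `d` are dropped because one modality is missing, and `c` is dropped because it is shorter than the assumed threshold, so only `a` is kept.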